Precision-Recall
The goal of this notebook is to use the GHSL Surface and Population raster datasets as ground truths for identifying human habitation. These will be cross-referenced with nighttime light data from the Black Marble and SDGSat satellites. Predictions will be generated across different thresholds for both satellite products to assess their robustness and to identify potential optimal threshold values for habitation detection.
The analysis starts with a rural case study in Nyala province, Sudan, and then moves on to investigate whether the results change in an urban setting like Winterthur, Switzerland. This approach helps to highlight differences in how nighttime light and built-up surface datasets perform across distinct geographic contexts.
import numpy as np
import holoviews as hv
import hvplot.xarray # noqa
import hvplot.pandas # noqa
from rasterio.enums import Resampling
from conflict_monitoring_ntl.case_studies import get_county_ids, get_date
from conflict_monitoring_ntl.satellites import (
BlackMarble,
GHSLSurface,
SDGSat,
GHSLPopulation,
)
from conflict_monitoring_ntl.viz import plot_tile_comparison, plot_binary
from conflict_monitoring_ntl.transform import RasterPipeline
from conflict_monitoring_ntl.utils import (
get_gdf_for_admin,
get_combined_mask,
get_non_nan_flat_array,
binarize_xarray,
get_precision_recall
)
Nyala - Sudan¶
country = "Sudan"
date, county_id = get_date(country), get_county_ids(country)[0]
county_gdf = get_gdf_for_admin(county_id)
Black Marble¶
GHSL Surface¶
The GHSL Surface dataset at 100m resolution encodes each pixel value as the total built-up surface in square meters within that 100m × 100m grid cell, where each cell represents a maximum of 10,000 m². The pixel value thus indicates how much of the cell’s area is covered by built structures—values range from 0 (no built-up surface) to 10,000 (fully built-up).
This dataset is derived through the automated processing of Landsat and Sentinel imagery, with built-up surfaces detected using machine learning and image classification algorithms designed to identify manmade structures.
sources: Google Eart Engine and Joint Research Centre Data Catalogue
rasters = [BlackMarble(), GHSLSurface()]
transformations = [{"reproject_match": {"resampling": Resampling.bilinear}}, {}]
pipeline = RasterPipeline(county_gdf, date, rasters, transformations)
ds = pipeline.run()
list(ds.data_vars.keys())
['black_marble_radiance', 'ghsl_surface']
The literature commonly adopts a 50% threshold to classify a GHSL built-up surface pixel as built-up, primarily in urban contexts (source). However, this threshold is often unsuitable for rural areas, as illustrated by the chart below. Therefore, for the GHSL Surface dataset at 100m resolution, I use a threshold of 0, meaning that any detectable built-up area within a pixel classifies it as "True." This approach better captures dispersed and low-density settlements typical of rural environments.
thresholds = np.arange(0, 5001, 500)
binary_dict = {t: plot_binary(ds.ghsl_surface, "GHSL Surface", t) for t in thresholds}
hv.HoloMap(binary_dict, kdims='threshold')
plot_tile_comparison(
arr_left=ds.ghsl_surface,
arr_right=ds.black_marble_radiance,
title_left="GHSL Surface",
title_right="Black Marble radiance (nW/cm²/sr)",
clim_left=(0, 50),
clim_right=(0, 2)
)
mask = get_combined_mask(ds)
ds = ds.where(mask)
ghsl_pop_binary = binarize_xarray(ds.ghsl_surface, 0)
y_true = get_non_nan_flat_array(ghsl_pop_binary)
bm_thresholds = np.arange(0.1, 2.01, 0.01).tolist()
df = get_precision_recall(ds.black_marble_radiance, y_true, bm_thresholds)
df.head(5)
| precision | recall | threshold | |
|---|---|---|---|
| 0 | 0.018686 | 0.973177 | 0.10 |
| 1 | 0.018668 | 0.968081 | 0.11 |
| 2 | 0.018662 | 0.963271 | 0.12 |
| 3 | 0.018663 | 0.958094 | 0.13 |
| 4 | 0.018665 | 0.951898 | 0.14 |
df.hvplot.scatter(
x="recall",
y="precision",
c="threshold",
height=400,
width=500,
clabel="threshold",
cmap="cividis"
)
As shown by the precision-recall curve above, Black Marble nighttime lights provide almost no predictive power for detecting built-up surfaces in rural areas. This likely occurs because low-density settlements, sparse infrastructure, and limited artificial lighting in rural environments often fall below the sensor’s detection threshold, resulting in weak or absent nighttime light signals—even when built-up structures are present. Additionally, rural areas may exhibit lights too faint or intermittent to be reliably captured and distinguished from natural background noise.
plot_tile_comparison(
arr_left=binarize_xarray(ds.ghsl_surface, 0),
arr_right=binarize_xarray(ds.black_marble_radiance, 1.0),
title_left="GHSL Surface",
title_right="Black Marble binary (threshold 1.0)",
colorbar=False
)
Interestingly, the binary classifications derived from GHSL Surface and Black Marble nighttime lights do not align, as also reflected in the radiance plot above. This discrepancy suggests that the ground truth may be less definitive than assumed, since the GHSL Surface data represents deduced built-up areas rather than directly measured artificial illumination.
GHSL Population¶
The GHSL Population dataset at 100m resolution encodes each pixel value as the estimated number of people residing within that 100m × 100m grid cell.
Population estimates are derived by spatially disaggregating census and administrative data. Locations and densities of residential population are inferred based on both the extent and intensity of built-up land identified in satellite imagery, with adjustments made through machine learning and modeling to match known administrative totals.
sources: Joint Research Centre Data Catalogue
rasters = [BlackMarble(), GHSLPopulation()]
transformations = [{"reproject_match": {"resampling": Resampling.bilinear}}, {}]
pipeline = RasterPipeline(county_gdf, date, rasters, transformations)
ds = pipeline.run()
Because a percentage threshold is not applicable for population rasters, a simple threshold of zero will be used. In this context, any pixel with a population value greater than zero is classified as "True." The chart below demonstrates the binary classification outcomes across varying threshold values.
thresholds = np.arange(0, 501, 100)
binary_dict = {t: plot_binary(ds.ghsl_population, "GHSL Population", t) for t in thresholds}
hv.HoloMap(binary_dict, kdims='threshold')
mask = get_combined_mask(ds)
ds = ds.where(mask)
plot_tile_comparison(
arr_left=ds.ghsl_population,
arr_right=ds.black_marble_radiance,
title_left="GHSL Population",
title_right="Black Marble radiance (nW/cm²/sr)",
clim_left=(0, 50),
clim_right=(0, 2)
)
A similar pattern emerges with the GHSL Population dataset, where areas of population clustering frequently do not overlap with regions displaying detectable nighttime lights.
ghsl_pop_binary = binarize_xarray(ds.ghsl_population, 0)
y_true = get_non_nan_flat_array(ghsl_pop_binary)
bm_thresholds = np.arange(0.1, 2.01, 0.01).tolist()
df = get_precision_recall(ds.black_marble_radiance, y_true, bm_thresholds)
df.hvplot.scatter(
x="recall",
y="precision",
c="threshold",
height=400,
width=500,
clabel="threshold",
cmap="cividis"
)
Given the evident mismatch between population clusters identified by GHSL Population and areas illuminated in nighttime light imagery, the predictive power of nighttime lights for detecting population is similarly negligible—just as observed with built-up surface data.
SDGSat¶
The same analysis is now applied to SDGSat-1 nighttime light data. Since there was no difference in results between the population and surface datasets with Black Marble, the following assessment will focus solely on the surface dataset.
GHSL Surface¶
rasters = [SDGSat(), GHSLSurface()]
transformations = [{"reproject_match": {"resampling": Resampling.bilinear}}, {}]
pipeline = RasterPipeline(county_gdf, date, rasters, transformations)
ds = pipeline.run()
plot_tile_comparison(
arr_left=ds.ghsl_surface,
arr_right=ds.sdgsat_dn,
title_left="GHSL Surface",
title_right="SDGSat DN",
clim_left=(0, 50),
clim_right=(1, 10)
)
mask = get_combined_mask(ds)
ds = ds.where(mask)
ghsl_pop_binary = binarize_xarray(ds.ghsl_surface, 0)
y_true = get_non_nan_flat_array(ghsl_pop_binary)
sdgsat_thresholds = np.arange(1.01, 2.01, 0.01).tolist()
df = get_precision_recall(ds.sdgsat_dn, y_true, sdgsat_thresholds)
df.hvplot.scatter(
x="recall",
y="precision",
c="threshold",
height=400,
width=500,
clabel="threshold",
cmap="cividis"
)
The results with SDGSat-1 are similar to those found with Black Marble: precision remains below 10 percent, and pixels identified as lighted by SDGSat do not overlap with built-up surface pixels.
Winterthur - Switzerland¶
I will apply the same analysis in Winterthur, an urban area near Zurich, Switzerland.
import datetime
import pygadm
date = datetime.date(2022, 3, 22)
county_gdf = pygadm.Items(name="Winterthur", content_level=2)
county_gdf = county_gdf.set_crs("EPSG:4326")
Black Marble¶
rasters = [BlackMarble(), GHSLSurface()]
transformations = [{"reproject_match": {"resampling": Resampling.bilinear}}, {}]
pipeline = RasterPipeline(county_gdf, date, rasters, transformations)
ds = pipeline.run()
plot_tile_comparison(
arr_left=ds.ghsl_surface,
arr_right=ds.black_marble_radiance,
title_left="GHSL Surface",
title_right="Black Marble radiance (nW/cm²/sr)",
clim_left=(0, 5000),
clim_right=(0, 20)
)
mask = get_combined_mask(ds)
ds = ds.where(mask)
ghsl_pop_binary = binarize_xarray(ds.ghsl_surface, 0)
y_true = get_non_nan_flat_array(ghsl_pop_binary)
bm_thresholds = np.arange(0.1, 2.01, 0.01).tolist()
df = get_precision_recall(ds.black_marble_radiance, y_true, bm_thresholds)
df.hvplot.scatter(
x="recall",
y="precision",
c="threshold",
height=400,
width=500,
clabel="threshold",
cmap="cividis"
)
plot_tile_comparison(
arr_left=binarize_xarray(ds.ghsl_surface, 0),
arr_right=binarize_xarray(ds.black_marble_radiance, 1.0),
title_left="GHSL Surface",
title_right="Black Marble binary (threshold 1.0)",
colorbar=False
)
SDGSat¶
rasters = [SDGSat(), GHSLSurface()]
transformations = [{"reproject_match": {"resampling": Resampling.bilinear}}, {}]
pipeline = RasterPipeline(county_gdf, date, rasters, transformations)
ds = pipeline.run()
plot_tile_comparison(
arr_left=ds.ghsl_surface,
arr_right=ds.sdgsat_dn,
title_left="GHSL Surface",
title_right="SDGSat DN",
clim_left=(0, 5000),
clim_right=(0, 20)
)
mask = get_combined_mask(ds)
ds = ds.where(mask)
ghsl_pop_binary = binarize_xarray(ds.ghsl_surface, 0)
y_true = get_non_nan_flat_array(ghsl_pop_binary)
sdgsat_thresholds = np.arange(1.01, 2.01, 0.01).tolist()
df = get_precision_recall(ds.sdgsat_dn, y_true, sdgsat_thresholds)
df.hvplot.scatter(
x="recall",
y="precision",
c="threshold",
height=400,
width=500,
clabel="threshold",
cmap="cividis"
)
plot_tile_comparison(
arr_left=binarize_xarray(ds.ghsl_surface, 0),
arr_right=binarize_xarray(ds.sdgsat_dn, 1.5),
title_left="GHSL Surface",
title_right="SDGSat binary (threshold 1.5)",
colorbar=False
)
For both Black Marble and SDGSat-1, the results in Winterthur are significantly better than in rural Sudan. This indicates that in dense urban environments, nighttime light detections from both satellites have much higher overlap with built-up areas, making them more reliable proxies for urban habitation and infrastructure.